
LAD-BNet: Lag-Aware Dual-Branch Networks for Real-Time Energy Forecasting on Edge Devices

Lignier, Jean-Philippe

arXiv.org Machine Learning

Real-time energy forecasting on edge devices represents a major challenge for smart grid optimization and intelligent buildings. We present LAD-BNet (Lag-Aware Dual-Branch Network), an innovative neural architecture optimized for edge inference with Google Coral TPU. Our hybrid approach combines a branch dedicated to explicit exploitation of temporal lags with a Temporal Convolutional Network (TCN) featuring dilated convolutions, enabling simultaneous capture of short- and long-term dependencies. Tested on real energy consumption data with 10-minute temporal resolution, LAD-BNet achieves 14.49% MAPE at a 1-hour horizon with only 18 ms inference time on Edge TPU, an 8-12x acceleration compared to CPU. The multi-scale architecture enables predictions up to 12 hours ahead with controlled performance degradation. Our model demonstrates a 2.39% improvement over LSTM baselines and 3.04% over pure TCN architectures, while maintaining a 180 MB memory footprint suitable for embedded device constraints. These results pave the way for industrial applications in real-time energy optimization, demand management, and operational planning.
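As a concrete illustration of the TCN branch described above, here is a minimal sketch (not the authors' code; function names are our own) of a 1-D causal dilated convolution, together with the receptive-field formula that explains how stacked dilations let a small kernel cover long lag horizons:

```python
# Sketch of a causal dilated convolution, the building block of a TCN branch.
# Causality: output[t] depends only on x[t], x[t-d], x[t-2d], ...

def causal_dilated_conv1d(x, weights, dilation):
    """Apply a causal 1-D convolution with the given dilation."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(weights):
            j = t - i * dilation  # look back i*dilation steps
            if j >= 0:
                acc += w * x[j]
        out.append(acc)
    return out

def receptive_field(kernel_size, dilations):
    """Past context seen by a stack of dilated conv layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Four layers, kernel 3, dilations 1, 2, 4, 8 -> 31 past steps of context.
rf = receptive_field(3, [1, 2, 4, 8])
```

Doubling the dilation at each layer grows the receptive field exponentially with depth, which is what makes TCNs attractive for long-range dependencies at low compute cost.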


CHURRO: Making History Readable with an Open-Weight Large Vision-Language Model for High-Accuracy, Low-Cost Historical Text Recognition

Semnani, Sina J., Zhang, Han, He, Xinyan, Tekgürler, Merve, Lam, Monica S.

arXiv.org Artificial Intelligence

Accurate text recognition for historical documents can greatly advance the study and preservation of cultural heritage. Existing vision-language models (VLMs), however, are designed for modern, standardized texts and are not equipped to read the diverse languages and scripts, irregular layouts, and frequent degradation found in historical materials. This paper presents CHURRO, a 3B-parameter open-weight VLM specialized for historical text recognition. The model is trained on CHURRO-DS, the largest historical text recognition dataset to date. CHURRO-DS unifies 155 historical corpora comprising 99,491 pages, spanning 22 centuries of textual heritage across 46 language clusters, including historical variants and dead languages. We evaluate several open-weight and closed VLMs and optical character recognition (OCR) systems on CHURRO-DS and find that CHURRO outperforms all other VLMs. On the CHURRO-DS test set, CHURRO achieves 82.3% (printed) and 70.1% (handwritten) normalized Levenshtein similarity, surpassing the second-best model, Gemini 2.5 Pro, by 1.4% and 6.5%, respectively, while being 15.5 times more cost-effective. By releasing the model and dataset, we aim to enable community-driven research to improve the readability of historical texts and accelerate scholarship.
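The evaluation metric the abstract reports, normalized Levenshtein similarity, can be sketched in a few lines (a standard definition, not code from the paper): similarity is one minus the edit distance divided by the longer string's length.

```python
# Normalized Levenshtein similarity between a reference transcription
# and a model hypothesis; 1.0 means a perfect match.

def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def normalized_similarity(ref, hyp):
    if not ref and not hyp:
        return 1.0
    return 1.0 - levenshtein(ref, hyp) / max(len(ref), len(hyp))
```

Under this metric, the reported 82.3% on printed text means the model's output is, on average, within roughly 18% edit operations of the reference per character of the longer string.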



Instituto de Telecomunicações at IWSLT 2025: Aligning Small-Scale Speech and Language Models for Speech-to-Text Learning

Attanasio, Giuseppe, Sannigrahi, Sonal, Peters, Ben, Martins, André F. T.

arXiv.org Artificial Intelligence

This paper presents the IT-IST submission to the IWSLT 2025 Shared Task on Instruction Following Speech Processing. We submit results for the Short Track, i.e., speech recognition, translation, and spoken question answering. Our model is a unified speech-to-text model that integrates a pre-trained continuous speech encoder and text decoder through a first phase of modality alignment and a second phase of instruction fine-tuning. Crucially, we focus on using small-scale language model backbones (< 2B) and restrict to high-quality, CC-BY data along with synthetic data generation to supplement existing resources.


Quantum Adaptive Self-Attention for Quantum Transformer Models

Chen, Chi-Sheng, Kuo, En-Jui

arXiv.org Artificial Intelligence

Transformer models have revolutionized sequential learning across various domains, yet their self-attention mechanism incurs quadratic computational cost, posing limitations for real-time and resource-constrained tasks. To address this, we propose Quantum Adaptive Self-Attention (QASA), a novel hybrid architecture that enhances classical Transformer models with a quantum attention mechanism. QASA replaces dot-product attention with a parameterized quantum circuit (PQC) that adaptively captures inter-token relationships in the quantum Hilbert space. Additionally, a residual quantum projection module is introduced before the feedforward network to further refine temporal features. Our design retains classical efficiency in earlier layers while injecting quantum expressiveness in the final encoder block, ensuring compatibility with current NISQ hardware. Experiments on synthetic time-series tasks demonstrate that QASA achieves faster convergence and superior generalization compared to both standard Transformers and reduced classical variants. Preliminary complexity analysis suggests potential quantum advantages in gradient computation, opening new avenues for efficient quantum deep learning models.
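To make the quadratic cost concrete, here is a plain classical scaled dot-product attention (an assumption-free textbook baseline, not the paper's quantum circuit): each of the n queries scores all n keys, so the score matrix alone costs O(n²).

```python
import math

# Standard scaled dot-product attention over n tokens of dimension d.
# The n x n score matrix is the quadratic bottleneck QASA targets.

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    d = len(Q[0])
    n = len(Q)
    out = []
    for i in range(n):
        # n scores per query -> n*n scores in total: O(n^2)
        scores = [sum(Q[i][t] * K[j][t] for t in range(d)) / math.sqrt(d)
                  for j in range(n)]
        w = softmax(scores)
        out.append([sum(w[j] * V[j][t] for j in range(n))
                    for t in range(len(V[0]))])
    return out
```

Replacing the dot-product scoring with a parameterized quantum circuit, as QASA proposes, changes how inter-token relationships are computed while keeping this overall query-key-value structure.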


Foundation Models -- A Panacea for Artificial Intelligence in Pathology?

Mulliqi, Nita, Blilie, Anders, Ji, Xiaoyi, Szolnoky, Kelvin, Olsson, Henrik, Boman, Sol Erika, Titus, Matteo, Gonzalez, Geraldine Martinez, Mielcarz, Julia Anna, Valkonen, Masi, Gudlaugsson, Einar, Kjosavik, Svein R., Asenjo, José, Gambacorta, Marcello, Libretti, Paolo, Braun, Marcin, Kordek, Radzislaw, Łowicki, Roman, Hotakainen, Kristina, Väre, Päivi, Pedersen, Bodil Ginnerup, Sørensen, Karina Dalsgaard, Ulhøi, Benedicte Parm, Ruusuvuori, Pekka, Delahunt, Brett, Samaratunga, Hemamali, Tsuzuki, Toyonori, Janssen, Emilius A. M., Egevad, Lars, Eklund, Martin, Kartasalo, Kimmo

arXiv.org Artificial Intelligence

The role of artificial intelligence (AI) in pathology has evolved from aiding diagnostics to uncovering predictive morphological patterns in whole slide images (WSIs). Recently, foundation models (FMs) leveraging self-supervised pre-training have been widely advocated as a universal solution for diverse downstream tasks. However, open questions remain about their clinical applicability and generalization advantages over end-to-end learning using task-specific (TS) models. Here, we focused on AI with clinical-grade performance for prostate cancer diagnosis and Gleason grading. We present the largest validation of AI for this task, using over 100,000 core needle biopsies from 7,342 patients across 15 sites in 11 countries. We compared two FMs with a fully end-to-end TS model in a multiple instance learning framework. Our findings challenge assumptions that FMs universally outperform TS models. While FMs demonstrated utility in data-scarce scenarios, their performance converged with - and was in some cases surpassed by - TS models when sufficient labeled training data were available. Notably, extensive task-specific training markedly reduced clinically significant misgrading, misdiagnosis of challenging morphologies, and variability across different WSI scanners. Additionally, FMs used up to 35 times more energy than the TS model, raising concerns about their sustainability. Our results underscore that while FMs offer clear advantages for rapid prototyping and research, their role as a universal solution for clinically applicable medical AI remains uncertain. For high-stakes clinical applications, rigorous validation and consideration of task-specific training remain critically important. We advocate for integrating the strengths of FMs and end-to-end learning to achieve robust and resource-efficient AI pathology solutions fit for clinical use.
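The multiple instance learning framework mentioned above treats each whole slide image as a "bag" of patch-level features and aggregates them into one slide-level prediction. A minimal sketch under our own simplifying assumptions (scalar patch scores, illustrative function names, not the study's code):

```python
import math

# Three common MIL pooling rules for turning patch scores into a slide score.

def mil_max_pool(patch_scores):
    """Slide is as positive as its most suspicious patch."""
    return max(patch_scores)

def mil_mean_pool(patch_scores):
    """Average evidence across all patches in the bag."""
    return sum(patch_scores) / len(patch_scores)

def mil_attention_pool(patch_scores, attn_logits):
    """Weight each patch by a (learned) attention logit before pooling."""
    m = max(attn_logits)
    w = [math.exp(a - m) for a in attn_logits]
    s = sum(w)
    return sum(wi / s * p for wi, p in zip(w, patch_scores))
```

In practice the attention logits come from a small network trained end-to-end with the slide-level label, which is what lets either a foundation-model encoder or a task-specific encoder plug into the same aggregation head.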


Quantum Recurrent Neural Networks with Encoder-Decoder for Time-Dependent Partial Differential Equations

Chen, Yuan, Khaliq, Abdul, Furati, Khaled M.

arXiv.org Artificial Intelligence

Nonlinear time-dependent partial differential equations are essential in modeling complex phenomena across diverse fields, yet they pose significant challenges due to their computational complexity, especially in higher dimensions. This study explores Quantum Recurrent Neural Networks within an encoder-decoder framework, integrating Variational Quantum Circuits into Gated Recurrent Units and Long Short-Term Memory networks. We evaluate the algorithms on the Hamilton-Jacobi-Bellman equation, Burgers' equation, the Gray-Scott reaction-diffusion system, and the three-dimensional Michaelis-Menten reaction-diffusion equation. The results demonstrate the superior performance of the quantum-based algorithms in capturing nonlinear dynamics, handling high-dimensional spaces, and providing stable solutions, highlighting their potential as an innovative tool for solving challenging and complex systems. Partial differential equations (PDEs) are fundamental mathematical tools for modeling diverse phenomena in fields such as physics, biology, chemistry, and economics. However, for many complex and high-dimensional PDEs, analytical solutions are often unattainable. To address this, numerical methods such as the finite-difference method (FDM) [1], finite-element method (FEM) [2], and finite-volume method (FVM) [3] have been developed to approximate solutions. These techniques have been effective in a variety of applications but face limitations in computational complexity, stability, and scalability, especially when applied to non-linear or high-dimensional problems.
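As a reference point for the classical baselines mentioned above, here is a minimal finite-difference sketch (our own illustrative example, not from the paper) of an explicit Euler step for the 1-D heat equation u_t = ν·u_xx, with r = ν·Δt/Δx², stable for r ≤ 0.5:

```python
# Explicit finite-difference step for u_t = nu * u_xx on a uniform grid,
# with fixed (Dirichlet) boundary values. r = nu * dt / dx**2.

def fdm_heat_step(u, r):
    """One forward-Euler step of the second-order central-difference stencil."""
    return ([u[0]]
            + [u[i] + r * (u[i + 1] - 2 * u[i] + u[i - 1])
               for i in range(1, len(u) - 1)]
            + [u[-1]])

def fdm_heat(u, r, steps):
    for _ in range(steps):
        u = fdm_heat_step(u, r)
    return u
```

The cost of such schemes grows rapidly with dimension (a grid of N points per axis needs N^d unknowns), which is precisely the scalability limitation motivating learned solvers.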


MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages

Gaido, Marco, Papi, Sara, Bentivogli, Luisa, Brutti, Alessio, Cettolo, Mauro, Gretter, Roberto, Matassoni, Marco, Nabih, Mohamed, Negri, Matteo

arXiv.org Artificial Intelligence

The rise of foundation models (FMs), coupled with regulatory efforts addressing their risks and impacts, has sparked significant interest in open-source models. However, existing speech FMs (SFMs) fall short of full compliance with the open-source principles, even if claimed otherwise, as no existing SFM has model weights, code, and training data publicly available under open-source terms. In this work, we take the first step toward filling this gap by focusing on the 24 official languages of the European Union (EU). We collect suitable training data by surveying automatic speech recognition datasets and unlabeled speech corpora under open-source compliant licenses, for a total of 950k hours. Additionally, we release automatic transcripts for 441k hours of unlabeled data under the permissive CC-BY license, thereby facilitating the creation of open-source SFMs for the EU languages.


Anomaly Detection from a Tensor Train Perspective

Ali, Alejandro Mata, de Leceta, Aitor Moreno Fdez., Rubio, Jorge López

arXiv.org Artificial Intelligence

We present a series of algorithms in tensor networks for anomaly detection in datasets, by using data compression in a Tensor Train representation. These algorithms consist of preserving the structure of normal data in compression and deleting the structure of anomalous data. The algorithms can be applied to any tensor network representation. We test the effectiveness of the methods with digits and Olivetti faces datasets and a cybersecurity dataset to determine cyber-attacks.
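The core idea, that normal data survives compression while anomalies lose structure, can be illustrated with a pure-Python stand-in for the paper's tensor-network compression (a rank-1 approximation via power iteration; names and setup are our own, much simpler than an actual Tensor Train):

```python
# Fit the leading direction of "normal" data, then score new samples by
# reconstruction error: anomalies compress poorly and get high scores.

def rank1_direction(rows, iters=50):
    """Leading right-singular direction of the data matrix via power iteration."""
    d = len(rows[0])
    v = [1.0] * d
    for _ in range(iters):
        proj = [sum(r[j] * v[j] for j in range(d)) for r in rows]   # A v
        w = [sum(p * r[j] for p, r in zip(proj, rows)) for j in range(d)]  # A^T A v
        norm = sum(x * x for x in w) ** 0.5 or 1.0
        v = [x / norm for x in w]
    return v

def anomaly_score(x, v):
    """Residual norm after projecting x onto the normal direction."""
    c = sum(xi * vi for xi, vi in zip(x, v))
    return sum((xi - c * vi) ** 2 for xi, vi in zip(x, v)) ** 0.5
```

A Tensor Train generalizes this to high-order tensors by chaining truncated SVDs along each mode, but the anomaly-scoring logic, compress with the structure learned from normal data and flag large residuals, is the same.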